Abstract

An  experiment  was  performed  investigating  the  effect 
of  magnitude  of  reward  and  level  of  acquisition  training 
on  extinction  performance  and  reversal  performance  in  a 
black-white discrimination task. Eighty rats (eight groups)
were run, with six groups constituting a 2 x 3 factorial
design in which the independent variables were two levels
of acquisition training (40 or 100 trials) and three levels
of reward magnitude (2, 4, or 8 Noyes pellets). These six
groups received first acquisition training, then 60 extinction
trials, and finally reversal training to criterion.

In addition to these six groups there were two groups that
had either 40 or 100 acquisition trials with a reward magnitude
of 8 pellets, and then proceeded directly to reversal
training.

The results showed that, compared with nonovertraining,
overtraining led to a greater degree of extinction and
faster reversal with large reward; with small reward, overtraining
led to a lesser degree of extinction and slower
reversal. For large reward magnitude groups, an interpolated
extinction period between acquisition and reversal
led to faster reversal learning. An ORE (overlearning
reversal effect) was found only in large reward magnitude
groups that did not have the interpolated extinction
training.

These findings indicate that speed of reversal
learning  is  related  to  the  level  of  extinction  of  original 
learning,  and  that  extinction  of  original  learning  is  an 
important  process  in  reversal  learning. 

The  results  tend  to  support  the  extinction  analysis 
of  the  ORE. 
Introduction

The  overlearning  reversal  effect  (ORE)  refers  to  the 
finding  that  under  some  conditions  the  reversal  of  a 
discrimination task (i.e., learning to respond to the formerly
negative stimulus) is facilitated by additional trials
beyond  a  performance  criterion  on  the  original  learning. 

One of the earliest overtraining studies (but not an overtraining
reversal study) was reported by Jackson (1932).
He found differential maze performance in rats due to the
amount of prior training in a different maze. According
to Jackson the transfer effect "... was large and
negative when there had been slight overlearning on the
first maze, but become positive, although small, when
overlearning had been carried further." The current interest
in the ORE phenomenon was generated by Reid (1953)
with  an  experiment  designed  to  bear  on  the  continuity- 
noncontinuity  issue  of  learning.  At  that  time  continuity 
theorists  would  have  predicted  that  overtraining  on  a 
discrimination  task  would  lead  to  a  reverse  ORE,  that  is, 
overtrained  Ss  would  reverse  slower  than  criterion  trained 
Ss.  It  was  reasoned  that  overtraining  would  lead  to  a 
stronger  original  habit  and  greater  interference  in  reversal, 
since  the  original  habit  had  to  be  extinguished  in  reversal. 

On the basis of Harlow's (1949) finding that animals
solve discrimination problems faster the more problems
they have solved previously, Reid (1953) predicted that
overtraining on one problem would have a similar facilitative
effect on reversal training. Contrary to predictions
from continuity theory, Reid found that rats given
150  post-criterion  trials  on  a  black-white  discrimination 
in  a  Y-maze  performed  the  reversal  of  that  discrimination 
in  fewer  trials  than  rats  trained  only  to  a  criterion. 

Reid  attempted  to  place  his  results  within  the  framework 
of continuity theory by assuming that the original learning
involved not only the attaching of specific habits to
positive  and  negative  cues,  but  in  addition  "...  learning 
a  response  of  discriminating,  such  as  a  clear-cut  looking 
at  one  stimulus  card,  looking  at  the  other  stimulus  card, 
and immediately making a response to the correct card."
This mediating response was a concept similar to Wyckoff's
'observing response' (1952), and would occur, and would be
reinforced consistently, only after mastery of the
discrimination task. During reversal the overtrained Ss
have this response, which would lead to continued 'observing'
of the relevant stimulus dimension, and subsequently
faster reversal compared to criterion trained Ss who had
not  been  responding  consistently  within  the  relevant 
stimulus  dimension. 

Reid's  account  of  the  ORE  was  somewhat  weakened  by 
an  experiment  by  Birch,  Ison  and  Sperling  (1960)  in  which 
an ORE based on response latency was found using the
successive stimulus presentation of a brightness discrimination.
With successive stimulus presentation the response
of  discriminating  could  not  occur  since  on  each  trial  only 
one  stimulus  was  presented.  In  addition,  the  facilitative 
effect  leading  to  the  ORE  appeared  to  be  located  in  the 
extinction  of  the  original  responses  in  the  reversal  stage. 
Birch et al. (1960) found that the overtrained animals
reached the reversal criterion sooner. The reversal criterion
was defined as the first day in which all five latencies
to  the  positive  stimulus  were  less  than  the  smallest  latency 
to  the  negative  stimulus.  The  acquisition  rate  of  reversal 
learning,  as  measured  by  latencies  to  the  now-positive 
stimulus  was  similar  for  both  the  overtrained  and  criterion 
trained  groups.  The  overtrained  group  however  extinguished 
faster  (in  terms  of  response  latency)  its  response  to  the 
previously  positive  stimulus.  In  this  study  then,  the 
superior  performance  of  the  overtrained  group  appeared  to  be 
due  to  the  faster  extinction  of  response  tendencies  to  the 
previously  positive  stimulus.  Birch  et  al  pointed  out  that 
this  finding  does  not  necessarily  imply  that  the  response 
of  discriminating  is  not  operative  in  the  simultaneous 
choice  discrimination;  rather  the  study  suggests  that 
other  variables,  such  as  differential  extinction,  also 
determine  the  ORE. 
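
The latency criterion described above can be stated compactly in code. The following is a minimal sketch, not the procedure of Birch et al.: the function name, the data layout (one pair of latency lists per day) and any latencies passed to it are hypothetical and serve only as illustration.

```python
from typing import Optional, Sequence, Tuple

# One day of data: (latencies to the positive stimulus,
#                   latencies to the negative stimulus).
Day = Tuple[Sequence[float], Sequence[float]]


def first_criterion_day(days: Sequence[Day]) -> Optional[int]:
    """Return the 1-based index of the first day on which every latency
    to the positive stimulus is shorter than the smallest latency to the
    negative stimulus, or None if no day meets the criterion."""
    for index, (positive, negative) in enumerate(days, start=1):
        if not positive or not negative:
            continue  # a day without both kinds of trial cannot be scored
        if max(positive) < min(negative):
            return index
    return None
```

Under this rule a group can reach criterion sooner either by responding faster to the new positive stimulus or by slowing sooner to the new negative stimulus, which is why the two latency measures have to be examined separately.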

In  Reid's  analysis  of  the  ORE,  this  effect  is  due 
to the non-specific effect of learning to discriminate.
Another set of experiments which Reid's hypothesis does
not account for involves nonreversal shifts, in which it
is found that overtraining on the original discrimination
does not facilitate reversal on a previously presented,
but irrelevant discrimination. Using a Lashley jumping
stand, Mackintosh (1962) trained some Ss on a reversal
shift, others on a nonreversal shift. Overtraining
facilitated  a  reversal  shift,  but  retarded  a  nonreversal 
shift.  Reid's  analysis  of  the  ORE  would  have  led  to  the 
prediction  that  both  types  of  reversals  would  have  been 
facilitated  with  overtraining. 

Since  Reid's  (1953)  unexpected  results,  a  large  number 
of  ORE  studies  have  appeared,  which  have  been  summarized 
in  several  reviews  (Mackintosh,  1965a;  Paul,  1965;  Sperling, 
1965a, 1965b). Although there have been a number of
studies which have reported finding the ORE (Birnbaum,
1964; Brookshire, Warren & Ball, 1961; Bruner, Mandler,
O'Dowd & Wallach, 1958; Capaldi, 1963; Capaldi & Stevenson,
1957; Hooper, 1967; Ison & Birch, 1961; Mackintosh, 1962,
1963; North & Clayton, 1959; Pubols, 1956; Sasaki, 1960;
Theios & Blosser, 1965a, 1965b), there have been many
studies which have failed to find an ORE (Clayton, 1963a,
1963b; D'Amato & Jagoda, 1962; D'Amato & Schiff, 1964;
Erlebacher, 1963; Hill, Spear & Clayton, 1962; Hirayoshi
& Warren, 1965; Kendler & Kimm, 1964, 1967; Mackintosh,
1965; Theios & Blosser, 1965). Because of these
inconsistent results there have been attempts to determine
the  variables  controlling  the  ORE,  and  to  devise  theoretical 
models  which  predict  the  ORE  in  some  cases,  but  not  in 
others.  Some  of  the  variables  that  have  been  unsuccessfully 
manipulated  in  overtraining  reversal  studies  are  the 
intertrial intervals (Capaldi & Stevenson, 1957), deprivation
level (Bruner et al., 1958), discriminability of difference
between  acquisition  and  reversal  (D'Amato  &  Schiff,  1964),  and 
strain  differences  (D'Amato  &  Schiff,  1964). 

Basically  there  have  been  two  sets  of  models  to  account 
for  the  ORE.  Extinction  hypotheses  have  attempted  to  derive 
the  ORE  on  the  basis  of  faster  extinction  of  the  original 
learning in reversal learning for overtrained Ss. Attention
models have as their main mechanism to account for the ORE
the differential attention to the relevant stimulus dimension
for overtrained and non-overtrained Ss.

According  to  the  extinction  hypothesis  a  reversal 
experiment  involves  three  stages:  (1)  acquisition  of  the 
discrimination,  (2)  extinction  of  the  discrimination,  and 
(3) acquisition of the reversal of the discrimination.
Usually stages (2) and (3) overlap. Overtraining facilitates
reversal learning because it shortens the extinction
phase.

An  early  model  by  Capaldi  and  Stevenson  (1957)  which 
attempted  to  explain  the  ORE  on  the  basis  of  differential 
extinction hypothesized that, compared to criterion training,
overtraining leads to an easier discrimination between
acquisition  and  extinction  and  therefore  faster 
extinction.  Since  in  the  model  reversal  learning 
involves  the  extinction  of  the  original  learning,  the 
variables,  such  as  amount  of  training,  that  result  in 
differential  extinction  performance  should  also  lead  to 
differential  reversal  performance. 

In Hull's (1943) theory, resistance to extinction was a monotonic,
positive function of the amount of acquisition training.
Hullian  S-R  theory  attributed  no  special  motivating  or 
inhibiting  effects  to  nonreinforcement  or  extinction. 
Resistance  to  extinction  was  a  function  of  habit  strength, 
or  operationally,  the  number  of  reinforced  learning  trials. 
Later,  however,  S-R  theorists  like  Amsel  (1958,  1962) 
attributed  motivational  properties  to  the  absence  of  reward 
in  extinction.  According  to  Amsel,  nonreinforcement  leads 
to  frustration,  which  leads  to  the  inhibition  of  previously 
reinforced  responses,  or  extinction.  One  implication  of 
the  frustration  hypothesis  is  that  greater  frustration  during 
experimental  extinction  leads  to  faster  extinction.  Since 
the  amount  of  frustration  is  an  increasing  function  of  the 
strength  of  the  conditioned  anticipatory  goal  responses, 
which  increases  with  the  number  and  size  of  rewards,  the 
frustration  hypothesis  predicts  that  overtraining  leads 
to  more  frustration  and,  therefore,  faster  extinction 
than criterion training. In support of this hypothesis a
number of studies investigating the effect of amount of
training on extinction performance in a straight alley
apparatus have shown that overtraining leads to faster
extinction than criterion training (Capaldi, 1958; Ison,
1962; Wagner, 1963). This overtraining effect on
extinction  performance  appears  to  depend,  at  least  partially, 
on an interaction with reward magnitude. Using a straight
runway apparatus, Ison (1962) found that with large reward
overtraining leads to lower resistance to extinction
(i.e., increased latencies) than nonovertraining, whereas
with small reward the reverse was observed. Other studies
(Armus,  1959;  Hulse,  1958;  Reynolds  &  Siegel,  1961)  in  which 
acquisition  training  was  not  varied,  have  shown  that  a 
large  magnitude  of  reward  in  acquisition  leads  to  faster 
extinction  than  a  small  reward  magnitude. 

The  implication  for  reversal  learning  of  these 
extinction  studies  is  obvious.  On  the  basis  of  the 
frustration  hypothesis  and  the  extinction  hypothesis  of 
the  ORE  one  would  predict  that  either  an  increase  in 
amount  of  reward  per  trial  or  an  increase  in  the  number 
of  trials  (using  at  least  a  moderate  magnitude  of  reward) 
would  lead  to  better  reversal  performance. 

Theios  and  Blosser  (1965a,  1965b)  have  presented 
evidence  for  a  model  related  to  the  frustration  hypothesis 
which  takes  into  account  both  the  variables  of  amount  of 
training and amount of reward in predicting extinction
and/or reversal performance. The model states that (a)
habit strength (H) is an increasing exponential function
of the number of instrumental responses in acquisition,
(b) incentive motivation (K) is an increasing exponential
function of the number of appetitively rewarded training
trials, (c) the H-function reaches asymptotic level before
the K-function, and (d) the asymptote of K is directly
related to the magnitude and quality of reward. According
to the model, "... the expected number of responses to an
extinction or reversal criterion is a linear function of
the difference between H and K at the start of extinction
or reversal learning" (Theios and Blosser, 1965a). Since
with  overtraining  K  increases  while  H  is  asymptotic,  the 
difference  between  H  and  K  decreases  producing  the  usual 
ORE.  Theios  and  Blosser  assert  that  many  studies  which 
have  failed  to  find  an  ORE  can  be  accounted  for  by  this 
model  on  the  basis  of  size  of  reward.  If  rewards  are  small 
then  the  asymptote  of  the  K  function  is  low,  which  would 
result  in  small  differences  along  the  K  function  between 
overtrained and criterion trained Ss. Consequently H-K
differences would be similar for nonovertrained and overtrained
Ss, and, according to the model, no ORE would
occur. Although there are exceptions, the majority of the
ORE studies bear out Theios and Blosser's prediction.

The following studies have used large rewards such as 10-20
seconds access to wet mash or .150-1.00 gm. food and have
found an ORE: Pubols (1956), Bruner et al. (1958), Komaki
(1961), Mackintosh (1962, 1963, 1965b), Capaldi (1963),
Theios and Blosser (1965a, 1965b), and Hooper (1967).

The following studies using small rewards (less than 10
seconds access or less than 150 mg. food) have failed
to find an ORE: D'Amato and Jagoda (1962), Hill et al.
(1962), Clayton (1963), Hill and Spear (1963), Mackintosh
(1965), Erlebacher (1963), Kendler and Kimm (1964, 1967).

The success and failure of finding the ORE in these
studies on the basis of size of reward support Theios and
Blosser's model.
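
The qualitative argument of the model can be illustrated numerically. The sketch below is an illustration under stated assumptions, not Theios and Blosser's fitted model: the negatively accelerated form 1 - exp(-rate * n), the two rate constants, the asymptotes, and the intercept and slope of the linear rule are all hypothetical values, chosen only so that H approaches its asymptote before K does.

```python
import math

H_RATE = 0.10  # hypothetical growth rate of habit strength H
K_RATE = 0.02  # hypothetical, slower growth rate of incentive motivation K
H_MAX = 1.0    # hypothetical asymptote of H


def habit_strength(n_trials: int) -> float:
    """H after n_trials acquisition trials (assumed exponential growth)."""
    return H_MAX * (1.0 - math.exp(-H_RATE * n_trials))


def incentive(n_trials: int, k_max: float) -> float:
    """K after n_trials rewarded trials; k_max stands in for the asymptote
    set by reward magnitude and quality."""
    return k_max * (1.0 - math.exp(-K_RATE * n_trials))


def expected_responses(n_trials: int, k_max: float,
                       intercept: float = 5.0, slope: float = 40.0) -> float:
    """Linear function of H - K at the start of extinction or reversal."""
    gap = habit_strength(n_trials) - incentive(n_trials, k_max)
    return intercept + slope * gap


def predicted_ore(k_max: float, criterion_trials: int = 40,
                  overtrained_trials: int = 100) -> float:
    """Expected saving in responses produced by overtraining."""
    return (expected_responses(criterion_trials, k_max)
            - expected_responses(overtrained_trials, k_max))
```

With these made-up constants, `predicted_ore(0.9)` (a high K asymptote, standing for large reward) is several times larger than `predicted_ore(0.2)` (a low asymptote, standing for small reward). This is the pattern the model uses to explain why an ORE appears mainly in large reward studies.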

There are, however, studies which have used large
rewards (as defined above) and have failed to find an ORE
(Clayton, 1965; D'Amato and Schiff, 1964; Kendler and Kimm,
1964, 1967), and studies which have used small rewards and
have reported an ORE (Ison, Birch and Sperling, 1961; North
and Clayton, 1959). The exceptions to the expectations of
the Theios and Blosser model suggest that variables other
than the magnitude of reward interact with amount of training
to determine reversal performance.

Sutherland  (1964)  and  Mackintosh  (1965a)  have  proposed 
an  attention  model  of  discrimination  learning  in  which  they 
attempt  to  account  for  the  ORE  in  terms  of  attention.  Their 
concept of attention is similar to Reid's (1953) discriminating
response in that both concepts involve the idea that the
animals learn about the cues in their environment. The
concept of attention differs from Reid's discriminating
response in that it is proposed that the animal's learning
about the cues in its environment is selective, depending
on the amount of attention focused on them, whereas for Reid
it is nonselective. According to Mackintosh (1965a) the
concept of attention concerns the selective or filtering
processes in perception.

Animals (particularly lower animals) have
nervous systems of limited size and therefore
of limited capacity for processing
and storing information. Thus they are
confronted with the problem of selection.
At some stage they must discard irrelevant
or redundant information so as not to
interfere with the storage of important
information. This line of argument would
seem to provide a general rationale for
postulating, as Broadhurst does, the existence
of filtering devices in the nervous system.

(Mackintosh,  1965,  p.  124) 

Sutherland  (1964)  and  Mackintosh  (1965a)  have  proposed  a 
two-stage model similar to Reid's, in which in order to solve
a discrimination problem the S has to (1) learn to attend
to  the  relevant  stimulus  dimension  (e.g.  brightness  of  the 
stimulus  material)  and  (2)  establish  appropriate  choice 
responses.  The  ORE  is  accounted  for  in  this  model  by 
assuming  that  the  S  is  still  learning  to  attend  with 
post-criterion  trials,  when  the  learning  of  the  choice 
responses  has  reached  asymptotic  level.  The  overtraining 
trials  strengthen  the  response  of  attending  to  the  relevant 
stimulus  dimension  and  thereby  increase  its  resistance  to 
extinction.  By  consistently  attending  to  the  relevant 
stimulus dimension in early reversal learning the overtrained
S establishes appropriate choice responses while still
attending to the effective stimuli (S+ and S-).
Nonovertrained Ss on the other hand quickly extinguish
attending to the relevant stimulus dimensions and attend
to other stimulus dimensions before reverting to the
relevant ones. Mackintosh describes the two processes
of discrimination learning in terms of "switching in"
and "switching out" analyzers.

First  an  analyzer  specific  to  the  relevant 
stimulus  dimension  is  switched  in,  and 
secondly  approach  and  avoidance  responses 
are  attached  to  the  outputs  of  the  analyzer 
representing  the  positive  and  negative 
stimuli.  Overtraining  is  assumed  to  have 
the  effect  of  strengthening  the  first  process 
so  that  the  analyzer  remains  switched  in  after 
the overt choice responses have been extinguished,
thus ensuring that the problem will
still  be  solved  in  terms  of  the  relevant 
dimension.  (Mackintosh,  1963,  p.  127-128) 

Within the framework of the attention model several
variables have been found to be determinants of reversal
performance, indicating that reversal performance is not
only a function of amount of training and reward. Some
of these variables are the predominance of the relevant
cues within the S's discriminative repertoire (Mackintosh,
1965b), phylogenetic level (Mackintosh, 1965b), presence
of irrelevant cues (Mackintosh, 1963), and type of reversal
shift (Mackintosh, 1962).

A major assumption of the model is that the facilitative
effect of overtraining leading to the ORE occurs in
the acquisition phase and is not due to faster extinction
of the original discrimination. According to Mackintosh
(1965a) "Overtraining facilitates reversal of a simultaneous
visual discrimination not because of, but in spite of, its
effect on extinction" (p. 131). He states that the ORE
occurs not because of faster extinction, but in spite of
increased resistance to extinction. Although there are
studies in which overtraining has resulted in increased
resistance to extinction (e.g., D'Amato & Jagoda, 1962),
there are many studies in which overtraining has resulted
in decreased resistance to extinction. Mackintosh recognizes
that in runway studies (e.g., Hill & Spear, 1963; Ison,
1962; Wagner, 1963), overtraining has been found to reduce
resistance  to  extinction,  when  this  resistance  to  extinction 
is  measured  by  running  speed  in  the  runway.  He  asserts 
however  that  in  discriminative  learning,  overtraining  leads 
to  increased  resistance  to  extinction,  when  resistance  to 
extinction  is  measured  by  the  number  of  trials  in  reversal 
in  which  the  animals  continue  to  select  the  originally 
positive  stimulus  (perseverative  errors) ,  or  when  it  is 
measured  by  the  number  of  trials  to  an  extinction  criterion. 

In three discrimination studies (D'Amato, Schiff & Jagoda,
1962; Mackintosh, 1962, 1963) overtraining led to more
perseverative errors or trials to an extinction criterion
(i.e., increased resistance to extinction) than nonovertraining.
In two of these three studies (Mackintosh, 1962, 1963),
but not in the other, overtraining also led to faster
reversal learning. However, in two other discrimination
studies (Kendler & Kimm, 1964, 1967) overtraining (with
large, but not with small reward) led to fewer perseverative
errors than criterion training (i.e., decreased resistance
to extinction), and in another study (Birch, Allison &
House, 1963) amount of acquisition training did not affect
number of perseverative errors. Latency measures during
the  experimental  extinction  of  a  discrimination  have  been 
found  to  be  larger  (decreased  resistance  to  extinction)  for 
overtrained  than  for  nonovertrained  Ss  (Mackintosh,  1963). 
These  six  studies  show  that  in  discrimination  learning 
overtraining  has  not  consistently  led  to  increased  or 
decreased  resistance  to  extinction,  when  resistance  to 
extinction  is  measured  by  trials  to  an  extinction  criterion, 
perseverative  errors,  or  by  latencies. 

Clearly,  the  attention  model  differs  from  the  extinction 
hypothesis  as  to  the  locus  of  the  facilitative  effect  of 
overtraining  in  subsequent  reversal  learning.  The  locus  of 
the  facilitative  effect  for  the  attention  model  is  in  the 
acquisition  phase,  while  the  locus  for  the  extinction 
hypothesis  is  in  the  extinction  (of  the  original  learning) 
part  of  reversal  learning. 

The present experiment was designed within the framework
of the extinction hypothesis of the ORE. This
experiment provides a test of whether a critical determinant
of the ORE is the extinction of original learning in reversal
learning.

If  an  ORE  is  obtained  because  the  situation  is  such  that 
the  original  response  extinguishes  more  quickly,  then  an 
ORE  should  occur  in  Ss  manipulated  in  acquisition  by  conditions 
which  are  known  to  lead  to  faster  extinction.  The  effects  of 
such  manipulations  (i.e.,  overtraining  and  large  reward 
magnitudes)  should  be  similar  both  in  extinction  behaviour 
for  Ss  who  have  an  extinction  phase  between  acquisition  and 
reversal,  and  in  reversal  behaviour  for  Ss  who  do  not  have  an 
extinction  phase.  An  ORE  in  the  usual  sense  should  occur  in 
reversal for large reward, nonextinction phase Ss; an OEE
(overlearning extinction effect, i.e., faster extinction for
overtrained animals) should occur for large reward Ss in the
interpolated extinction period, followed by an ORE in
reversal. If such data are obtained, the extinction hypothesis
of the ORE, rather than the attention model explanation of the
ORE, will be supported. Further support for the extinction
hypothesis of the ORE would be obtained if Ss with an interpolated
extinction period reverse more easily than Ss without
an interpolated extinction period. The specific hypotheses
that  were  tested  were: 

In  extinction 

1. Larger latencies and more errors were predicted for
larger acquisition magnitudes of reward.

2.  Larger  latencies  and  more  errors  were  predicted  with 
overtraining,  in  large  reward  groups,  but  not  in  small 
reward  groups. 

In  reversal 

1.  Smaller  latencies  and  fewer  trials  to  criterion  were 
predicted  for  the  larger  acquisition  magnitudes  of  reward. 

2.  Smaller  latencies  and  fewer  trials  to  criterion  were 
predicted  with  overtraining,  than  without  overtraining,  in 
large  reward  groups,  but  not  in  small  reward  groups. 

3. Smaller latencies and fewer trials to criterion were
predicted for groups with an interpolated extinction period,
than for groups without such a period.

Method

Subjects 

The  subjects  were  80  male  albino  rats  of  the  Sprague- 
Dawley  strain.  They  were  about  70  days  old  at  the  beginning 
of the experiment. During the experiment one S of group
100-4E  was  discarded  when  it  became  unmanageable  on  the 
seventh  day  of  extinction. 

Design 

The  design  was  basically  a  2  x  3  factorial  in  which 
the  independent  variables  were  two  levels  of  acquisition 
training  (40  or  100  trials)  and  three  levels  of  reward 
magnitude (2, 4, or 8 45-mg. Noyes pellets). These six
groups  (40-2E,  40-4E,  40-8E,  100-2E,  100-4E,  100-8E)  received 
first  40  or  100  acquisition  trials,  then  60  extinction  trials, 
and  finally  reversal  training  to  criterion.  In  addition 
to these six groups there were two groups (40-8NE and
100-8NE) that had either 40 or 100 acquisition trials with a
reward magnitude of 8 pellets, and then proceeded directly
to reversal training. Each of the eight groups had 10
randomly assigned Ss. Assignment to groups was done on
the basis of a table of random numbers. The eight groups
are shown in Table 1.

Table 1

The experimental design

                      Reward Magnitude
                 Extinction         No extinction
Trials        2        4        8         8
  40        40-2E    40-4E    40-8E     40-8NE
 100       100-2E   100-4E   100-8E    100-8NE

The  dependent  variables  used  were  starting  and  running 
latencies  in  acquisition  and  extinction,  number  of 
errors  in  acquisition  and  extinction,  trials  to  criterion 
in  reversal,  and  starting  and  running  latencies  for  the 
first  10  trials  of  reversal. 
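
The group structure and the random assignment of 80 Ss to eight groups of 10 can be sketched as follows. This is an illustrative sketch only: the experiment used a table of random numbers, whereas the code substitutes a seeded pseudo-random shuffle, and the function names and the seed are hypothetical.

```python
import random
from typing import Dict, Iterable, List

ACQUISITION_TRIALS = (40, 100)
PELLETS = (2, 4, 8)


def group_labels() -> List[str]:
    """Six factorial groups with extinction (E), then the two 8-pellet
    groups that go straight to reversal (NE)."""
    labels = [f"{t}-{p}E" for t in ACQUISITION_TRIALS for p in PELLETS]
    labels += [f"{t}-8NE" for t in ACQUISITION_TRIALS]
    return labels


def assign_to_groups(subjects: Iterable[int], seed: int = 0) -> Dict[str, List[int]]:
    """Shuffle the subjects and deal them into equal-sized groups."""
    labels = group_labels()
    shuffled = list(subjects)
    if len(shuffled) % len(labels) != 0:
        raise ValueError("subjects cannot be split into equal groups")
    random.Random(seed).shuffle(shuffled)  # stand-in for a random-number table
    size = len(shuffled) // len(labels)
    return {label: shuffled[i * size:(i + 1) * size]
            for i, label in enumerate(labels)}
```

Calling `assign_to_groups(range(1, 81))` yields the eight labels of Table 1, each with 10 subject numbers.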

Apparatus 

The  experiment  was  conducted  in  a  room  adjacent 
to  the  animals'  living  quarters.  Temperature  and  humidity 
were  kept  relatively  constant  and  were  approximately  the 
same  level  in  both  rooms.  Illumination  in  the  experimental 
room was provided by an 80-watt fluorescent light 6 feet
above  the  apparatus. 

Pretraining. The apparatus used for pretraining was
a  straight  alley  which  was  4"  wide,  5V'  high,  and  55"  long. 
It  was  constructed  of  wood,  painted  medium  grey,  and  covered 
with plexiglass. The alley could be divided into 5
compartments by plexiglass guillotine doors. All except
the  first  compartment  (start  box)  contained  a  food  cup. 

Training. The discrimination apparatus was a single-unit
T-maze. The stem of the maze was 25" long. The
first  7"  of  the  stem  comprised  the  start-box,  which  could 
be  closed  off  from  the  rest  of  the  stem  by  a  guillotine 
door.  The  arms  of  the  maze  were  20"  long,  of  which  the 
last  10"  comprised  the  goal-boxes.  Each  goal-box  could 
be separated from the rest of the arm by guillotine doors.
A further guillotine door was located at the end of the
stem  to  prevent  retracing  into  the  stem  after  entering 
one  of  the  arms.  The  entire  maze  was  4"  wide  and  5"  high, 
and  was  covered  with  meshed  wire,  except  for  the  start  and 
goal  boxes,  which  were  covered  with  plexiglass.  The  stem 
of  the  maze  was  painted  grey.  Black  and  white  inserts  were 
constructed  of  V'  plywood,  which  could  be  put  in  the  arms 
of  the  maze,  allowing  for  the  spatial  distribution  of  the 
discrimination  stimuli  (black-white).  The  inserts  covered 
the  walls  and  the  floor  of  the  arms  and  could  partly  be 
seen while the S was still in the stem portion. A timer that
was started by the raising of the start door and stopped
by a photoelectric cell located 6" beyond the door measured
starting latency. A second timer that was started by the first
photocell  and  stopped  by  photocells  located  2"  before  the 
goal  boxes  measured  combined  running  and  choice  latencies. 


Procedure

Animals  were  obtained  from  the  supplier  when  they  were 
60-65  days  old  and  weighed  between  175  and  200  grams. 

They  were  housed  two  to  a  cage  and  were  put  on  ad  lib  food 
and  water.  Food  deprivation  began  three  days  after 
arrival  and  five  days  before  the  pretraining.  Each  animal 
was handled every day and weighed every other day throughout
the experiment. During the experiment the differential
reinforcement  was  taken  into  account  in  determining  the 
food  rations  of  the  animals.  The  animals'  daily  food  ration 
including  rewards  in  the  maze  was  10  grams.  Each  animal 
was  run  at  the  same  time  of  day,  within  half  an  hour, 
throughout  the  experiment.  Twenty  minutes  after  each  day's 
trials,  the  Ss  were  given  their  daily  food  rations.  When 
they  had  completed  the  experiment  eight  animals  from  different 
groups  were  given  free  access  to  food  and  their  ad  lib 
weight  was  determined  10  days  later.  Comparison  of 
deprivation  and  ad  lib  weight  showed  that  the  deprivation 
weight  was  86%  of  the  ad  lib  weight. 

Pretraining. Each S was given 1 trial daily for 3
days in the pellet training alley, in which a trial was
defined as progressing from the start box to the fifth
(last) goal box. The animal was put in the start box and
the first door was raised, then lowered after the S had
entered the second compartment. After the S had eaten the
two pellets in the second compartment it was allowed to
progress to the next compartment, and so on to the fifth
compartment. On the fourth day of pretraining, each S
was given 3 free-choice test trials in the T-maze. On
the basis of these trials each animal's nonpreferred
brightness (black or white) was determined, and this became
the positive stimulus.

Training. The Ss were given 5 trials daily in all
stages of the experiment. The intertrial interval of about
20 minutes was spent in the Ss' home cages. The Ss were
given 40 or 100 acquisition trials on the brightness
discrimination. All groups, except groups 40-8NE and
100-8NE, then received 60 extinction trials, followed by
reversal training, while groups 40-8NE and 100-8NE proceeded
directly from acquisition to reversal. The criterion for
reversal learning was 9 out of 10 consecutive trials correct.
In acquisition and reversal a correction procedure was
used, allowing the S to retrace in the arms of the maze if
it made the wrong choice. When the S had entered the correct
goal box, the goal box door was closed. As soon as the S
had eaten the reward, it was removed from the goal box to
the home cage. Reversal procedures were identical to
acquisition procedures except that in reversal the opposite
brightness stimulus was rewarded. Extinction procedures
were the same as acquisition procedures except that in
extinction there was no reward, the S was not allowed to
retrace once it had entered either goal box, and it was
removed from the maze if it failed to enter a goal box
within 90 seconds of raising the start door. Such responses
were recorded as time errors.
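
The reversal criterion of 9 correct out of 10 consecutive trials can be illustrated with a short sketch. The function name and the example sequence are hypothetical, and counting the criterion run itself within trials to criterion is an assumption; the text does not say whether the criterion trials were included in the score.

```python
def trials_to_criterion(outcomes, window=10, needed=9):
    """Return the 1-based trial number on which the criterion is first met.

    outcomes: sequence of booleans, True for a correct first choice.
    Returns None if no run of `window` consecutive trials ever contains
    at least `needed` correct choices.
    """
    for end in range(window, len(outcomes) + 1):
        # Count correct choices in the block of `window` trials ending here.
        if sum(outcomes[end - window:end]) >= needed:
            return end
    return None


# Hypothetical record for one S: 5 errors, 4 correct, 1 error, 6 correct.
example = [False] * 5 + [True] * 4 + [False] + [True] * 6
print(trials_to_criterion(example))
```

A sliding window is used rather than fixed blocks of 10 because the criterion refers to any 10 consecutive trials.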

The  position  of  the  black  and  white  discriminanda 
on  any  one  trial  throughout  the  experiment  was  determined 
by  a  random  order.  The  restrictions  on  this  random  order 
were  that  on  any  10  trials  (2  days)  the  number  of  times 
the  positive  and  negative  stimuli  appeared  on  either  side 
was  5,  and  that  no  more  than  2  consecutive  trials  would 
be  rewarded  on  the  same  side. 
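
The two restrictions on the random order can be sketched as a small generator. This is only an illustration under stated assumptions: the function names, the use of rejection sampling, and the choice to enforce the run-length limit across the boundary between successive 10-trial blocks are all assumptions, not the procedure actually used to draw the sequences.

```python
import random


def has_run_of_three(sides):
    """True if any side occurs on three consecutive trials."""
    return any(sides[i] == sides[i + 1] == sides[i + 2]
               for i in range(len(sides) - 2))


def make_block(rng, previous_tail=()):
    """Draw one 10-trial block: positive stimulus 5 times left, 5 times right,
    with no more than 2 consecutive trials on the same side."""
    while True:
        block = ["L"] * 5 + ["R"] * 5
        rng.shuffle(block)
        # Check the run limit including the last two trials of the prior block.
        if not has_run_of_three(list(previous_tail) + block):
            return block


def make_schedule(n_blocks, seed=0):
    """Side of the positive (rewarded) stimulus for n_blocks * 10 trials."""
    rng = random.Random(seed)
    schedule = []
    for _ in range(n_blocks):
        schedule.extend(make_block(rng, tuple(schedule[-2:])))
    return schedule


# 40 acquisition trials = 4 blocks of 10 trials (2 days each).
print("".join(make_schedule(4)))
```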
Results

The  results  are  divided  into  four  parts:  acquisition, 
overtraining,  extinction,  and  reversal. 

Acquisition. The acquisition data were analyzed in
terms of starting and running latencies for 40 acquisition
trials and in terms of number of correct daily trials.
A repeated measures analysis of variance of correct daily
trials yielded significant effects of days and magnitude
of reward, and a significant days x reward interaction,
indicating that the form of the acquisition curves differed
for the three magnitudes of reward. A trend analysis of the
linear and quadratic components of the days x reward
interaction showed that the form of the group curves differed
both linearly and quadratically. A summary of the analysis
of variance is presented in Table 2. Figure 1 shows the
acquisition performance of the three reward groups in terms
of number of correct daily trials. It can be seen that large
reward (8) Ss learned the discrimination fastest, medium
reward (4) Ss were intermediate, and small reward (2) Ss
learned the discrimination slowest. Repeated measures
analyses of variance of starting latencies and running
latencies in acquisition (see Tables 3 and 4) indicated a
significant effect only due to days; the latencies decreased
with acquisition days.
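
As a reading aid for the analysis of variance tables, the sketch below recomputes the mean squares and F ratios of Table 2 from its sums of squares and degrees of freedom. It assumes the usual mixed-design error terms: reward is tested against subjects within groups, and days and days x reward against days x subjects within groups. The variable and function names are illustrative only.

```python
# (SS, df) pairs transcribed from Table 2.
between_effects = {"Reward (R)": (46.62, 2)}
between_error = (121.98, 57)      # Subj. w. groups
within_effects = {"Days (D)": (303.47, 7), "D x R": (25.61, 14)}
within_error = (354.93, 399)      # D x Subj. w. groups


def mean_square(ss, df):
    """Mean square = sum of squares divided by its degrees of freedom."""
    return ss / df


def f_ratio(effect, error):
    """F = MS(effect) / MS(error) for (SS, df) pairs."""
    return mean_square(*effect) / mean_square(*error)


f_values = {name: f_ratio(e, between_error) for name, e in between_effects.items()}
f_values.update({name: f_ratio(e, within_error) for name, e in within_effects.items()})

for name, value in f_values.items():
    print(f"{name}: F = {value:.2f}")
```

The recomputed ratios agree with the tabled F values to within rounding.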

Overtraining. A repeated measures analysis of variance
of errors in overtraining is summarized in Table 5 and
TABLE 2

Trend analysis of variance of daily
correct trials in acquisition for the
three reward groups

Source                       SS     df      MS       F       P

Reward (R)                 46.62     2    23.31    10.89   .001
Subj. w. groups           121.98    57     2.14
Days (D)                  303.47     7    43.35    48.73   .001
D x R                      25.61    14     1.83     2.06   .01
D x Subj. w. groups       354.93   399      .89
D x R (linear)              8.31     2     4.15     4.66   .01
D x R (quadratic)          10.30     2     5.15     5.79   .01
D x R (cubic)               1.60     2      .80      .90    -
TABLE 3

Repeated measures analysis of variance of starting
latencies for the three reward groups in acquisition

Source                        SS     df       MS        F       P

Reward (R)                  734.93     2    367.47     1.44     -
Subj. w. groups           14504.56    57    254.47
Days (D)                  41331.95     7   5904.56    42.66   .001
R x D                      2326.48    14    166.18     1.20     -
D x Subj. w. groups       55224.56   399    138.41
TABLE 4

Repeated measures analysis of variance of
running latencies in acquisition (for three
reward groups only)

Source                       SS     df      MS       F       P

Reward (R)                  2.01     2     1.01      .06     -
Subj. w. groups           939.60    57    16.48
Days (D)                 1735.35     7    14.72    21.18   .001
D x R                     206.04    14    14.70     1.25     -
D x Subj. w. groups      4669.95   399
TABLE 5

Repeated measures analysis of variance for
errors in overtraining

Source                       SS     df      MS       F       P

Reward (R)                  2.45     3      .82     1.90     -
Subj. w. groups            15.49    36      .43
Days (D)                    1.82    11      .17     1.58     -
D x R                       3.52    33      .11     1.02     -
D x Subj. w. groups        41.41   396      .11
FIG. 1. Mean daily correct trials in acquisition
for three reward magnitudes (groups 40-2E and 100-2E;
40-4E and 100-4E; 40-8E and 100-8E).
[Abscissa: days (5-trial blocks).]
indicates that there was no significant effect due to amount
of reward, days, or the days x reward interaction. The
average percentages of errors during overtraining for the small,
medium, and large reward groups were 4.5%, 2.1%, and 2%,
respectively, indicating that the error rate in overtraining
was quite low, and that the difference between reward groups
was small (see also Figure 2). The nonsignificant days
effect and days x reward interaction indicate that performance
of the groups had reached an asymptotic level at the beginning
of overtraining.

Extinction. Extinction performance was analyzed in terms
of starting and running latencies, and in terms of error scores
(i.e., responding to the stimulus that was negative in
acquisition). Repeated measures analyses of variance presented
in Tables 6 and 7 showed that both latencies increased over
days (p<.001). For starting latencies there was a significant
effect due to the number of acquisition trials (40 and 100,
p<.03) and to acquisition reward magnitude (p<.01).
Overtraining and large and medium reward led to larger starting
latencies (i.e., faster extinction), as can be seen graphically
in Figures 3 and 4. Figure 4 shows that the effect begins
quite early for medium and large reward groups, which remain
different from the small reward group throughout the extinction
period. There were no significant effects in running times,
except for the days effect, indicating that the running latencies
increased similarly for all groups.
TABLE 6

Repeated measures analysis of variance
of starting latencies (in seconds) in extinction

Source                        SS     df       MS        F       P

Training (T)               1534.89     1   1534.89     4.95    .03
Reward (R)                 2725.96     2   1362.98     4.39    .01
R x T                       906.44     2    453.22     1.46     -
Subj. w. groups           16755.63    54    310.29
Days (D)                   5935.41    11    593.58     4.45   .001
D x T                      2306.58    11    209.69     1.73     -
D x R                      1445.43    22     65.70      .54     -
D x T x R                  2374.86    22    107.95      .89     -
D x Subj. w. groups       71984.00   594    121.18
TABLE 7

Repeated measures analysis of variance of
running latencies (in seconds) in extinction

Source                         SS     df        MS        F       P

Training (T)                1195.62     1    1195.62      .08     -
Reward (R)                 69302.68     2   34651.34     2.41     -
R x T                      34722.20     2   17361.10     1.21     -
Subj. w. groups           776735.55    54   14383.99
Days (D)                  287213.16    11   26110.28    17.23   .001
D x T                      16617.47    11    1510.92     1.00     -
D x R                      40940.34    22    1860.92     1.23     -
D x T x R                  29626.77    22    1346.67      .89     -
D x Subj. w. groups       900187.75   594    1515.46

FIG. 2. Total daily errors in overtraining for
three reward groups (for each group, N = 10).
[Abscissa: days (5-trial blocks).]

FIG. 3. Mean starting latency in extinction for
three reward magnitudes as a function of training level.
[Ordinate: mean starting latency (secs.).]

FIG. 4. Mean starting latency in extinction for
three reward magnitudes over days.
[Ordinate: mean starting latency (in seconds).]
A summary of the repeated measures analysis of variance
of error scores in extinction is presented in Table 8, and
means and standard deviations are presented in Table 9.
The only significant main effect, that of days, indicates that
there was an increase in mean errors over days (p<.001). The
significant training x reward interaction (p<.02) indicates
that the number of errors in extinction was a function of
the interaction of acquisition reward magnitude and criterion
versus overtraining. After 40 trials the number of errors
was inversely related to magnitude of reward, while after
100 trials errors were directly related to magnitude of
reward (see Figure 5). The significant days x training
(p<.03) and days x reward (p<.02) interactions indicate
that the form of the error curves in extinction differed
significantly for the two acquisition training groups and for
the three acquisition reward magnitude groups. Figures 6
and 7 show, however, that none of the three reward
groups and neither of the two training groups was
consistently superior, as is indicated by the nonsignificant
main effects of reward and training.

Another analysis of extinction errors, which yielded the
same results as the preceding analysis, examined
the Ss' tendency to equalize their responses to the positive
and negative stimuli. This tendency was evaluated by obtaining
scores of deviation from chance responding (i.e., 50% to
either stimulus). The logic for this measure is that extinction
TABLE 8

Repeated measures analysis of variance
of errors in extinction

Source                       SS     df      MS       F       P

Training (T)                 .73     1      .73      .23     -
Reward (R)                   .47     2      .23      .07     -
R x T                      28.17     2    14.08     4.32    .02
Subj. w. groups           176.19    54     3.26
Days (D)                  338.27    11    30.75    31.36   .001
D x T                      21.31    11     1.94     1.98    .03
D x R                      38.09    22     1.73     1.77    .02
D x T x R                  18.73    22      .85      .87     -
D x Subj. w. groups       582.51   594      .98
TABLE 9

Means and standard deviations of errors in extinction

                              Magnitude of Reward
                        2             4             8          Overall
Groups              Mean    SD    Mean    SD    Mean    SD    Mean    SD

Acquisition 40      2.16   1.39   1.86   1.27   1.71   1.29   1.91   1.33
Acquisition 100     1.60   1.18   1.83   1.15   2.11   1.37   1.85   1.25
Overall             1.88   1.32   1.85   1.21   1.91   1.34

FIG. 5. Mean daily errors in extinction for three
reward magnitudes as a function of level of training.
[Ordinate: mean daily errors.]

FIG. 6. Mean daily errors in extinction for two
levels of acquisition training.
[Ordinate: mean daily errors.]

FIG. 7. Mean daily errors in extinction for three
reward magnitudes.
[Ordinate: mean daily errors; abscissa: days (5-trial blocks).]
can be represented as a progression toward equal response
tendencies to both the negative and positive stimuli.
The rate at which this equalization occurs gives an index
of rate of extinction.
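
The deviation-from-chance score can be illustrated with a minimal sketch. The exact scoring formula is not given in the text, so the absolute difference between the daily proportion of responses to the formerly positive stimulus and .50 is an assumption, as are the function name and the example counts, which are hypothetical and not data from the experiment.

```python
def deviation_from_chance(responses_to_positive, n_trials):
    """Absolute deviation of the proportion of responses to the formerly
    positive stimulus from chance responding (.50)."""
    proportion = responses_to_positive / n_trials
    return abs(proportion - 0.5)


# Hypothetical record for one S: choices of the formerly positive stimulus
# on each of six extinction days (5 trials per day).
daily_positive_choices = [5, 4, 4, 3, 3, 2]
scores = [round(deviation_from_chance(k, 5), 2) for k in daily_positive_choices]
print(scores)
```

On this measure a score that falls toward zero over days indicates that responding to the two stimuli is equalizing, and a faster fall indicates faster extinction.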

Reversal. The only analysis of latencies in reversal
in which there were significant effects other than for days
was for starting latencies for the six groups with an
extinction period. Only the first two days of reversal
training were used in these analyses, since animals began
reaching criterion starting on day 2. A repeated measures
analysis of variance indicated significant effects of
days (p<.001), training (p<.04), and days x training (p<.02)
(see Table 10). Figure 8 shows that the starting latencies
decreased over the first two days of reversal training. The
latencies for the overtrained groups decreased faster, although
they reached the same level as the non-overtrained groups.

An analysis of trials to reversal criterion of the six
extinction groups is shown in Table 11; Table 13, showing the means
and standard deviations for this same measure, indicates that no
ORE occurred in the six extinction groups. Overall, groups
with 100 acquisition trials performed significantly more poorly in
reversal (p<.001) than groups with 40 acquisition trials. A
significant reward effect (p<.001) indicated that, overall,
large and medium reward magnitude groups performed better in
reversal than small reward magnitude groups. The result best
describing the relationship among the six groups in the analysis
was the significant interaction (see Figure 9) between
TABLE 10

Repeated measures analysis of variance of
starting latency for the first two days in
reversal of six groups with an extinction
period.

Source                       SS     df      MS       F       P

Training (T)               74.77     1    74.77     4.30    .04
Reward (R)                 62.86     2    31.43     1.81     -
R x T                       7.90     2     3.95      .23     -
Subj. w. groups            93.84    54    17.37
Days (D)                  283.11     1   283.11    18.23   .001
D x T                      92.12     1    92.11     5.93    .02
D x R                      38.71     2    13.35     1.25     -
D x T x R                  94.90     2     4.75      .31     -
D x Subj. w. groups       838.52    54    15.53
TABLE 11

Analysis of variance of trials to reversal criterion
for groups with interpolated extinction.

Source               SS      df       MS        F       P

Trials (T)         2220.42     1   2220.42     9.08   .001
Reward (R)         4997.50     2   2498.75    10.22   .001
R x T              3270.83     2   1635.42     6.68   .001
Error             13202.50    54    244.49
Total             23691.25    59

TABLE 12

Analysis of variance of trials to reversal criterion
for groups 40-8E, 40-8NE, 100-8E and 100-8NE

Source               SS      df       MS        F       P

Training (T)       2975.62     1   2975.62    11.18    .01
Extinction (E)     5175.62     1   5175.62    19.45    .01
E x T               765.62     1    765.62     2.88     -
Error              9577.50    36    266.04
Total             18494.37
FIG. 8. Mean starting latency for two days of
reversal for two levels of acquisition training.
[Ordinate: mean starting latencies (secs.); abscissa: reversal days.]
TABLE 13

Means and standard deviations of trials to
criterion for all groups in reversal

                                   Extinction             No Ext.
Magnitude of reward          2       4       8    Overall     8

Acquisition 40     Mean    39.0    42.0    37.0    39.5     71.5
                   SD      13.1    14.4    15.3    13.7     21.9

Acquisition 100    Mean    70.0    31.0    31.5    44.1     45.5
                   SD      23.7     8.1    15.1    23.4     12.1

Overall            Mean    54.5    36.5    39.7    41.7     58.5
                   SD      24.5    15.3    25.0    23.2     21.7
magnitude of reward and level of acquisition training (p<.01).
Using Scheffe's test for multiple comparisons it was found
that group 100-2E differed significantly from group 40-2E
(for p<.05, F=20.55, F=30.076); group 100-2E required more
trials to reach the reversal criterion than group 40-2E.
The difference in the number of trials to reach the reversal
criterion between overtrained and nonovertrained large and
medium reward groups was in the predicted direction, but
did not reach significance.

An analysis of variance (see Table 12 and Figure 10)
of the two large reward groups with extinction (40-8E and
100-8E) and without extinction (40-8NE and 100-8NE)
showed that an interpolated extinction stage decreased the
number of trials to reversal (p<.01). In this analysis there
was also a significant effect due to level of training
(p<.01); the overtrained groups (100-8E and 100-8NE) reversed
faster than the non-overtrained groups (40-8E and 40-8NE).
Scheffe's test for multiple comparisons showed that group
100-8NE differed significantly from group 40-8NE in that
overtraining led to fewer trials to criterion (for p<.05,
F=12.33, F=12.71). In comparing groups 100-8E and 40-8E
it was found that overtraining led to fewer trials to
criterion, but not significantly so.
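
The comparison of groups 100-8NE and 40-8NE can be illustrated with one common form of the F statistic for a pairwise contrast, the squared mean difference divided by the error mean square times (1/n1 + 1/n2). The sketch uses the group means from Table 13 and the error mean square from Table 12. It assumes 10 Ss per group (80 rats in eight groups), and it is an assumption that the reported value was computed this way; the function name is illustrative.

```python
def pairwise_contrast_f(mean_a, mean_b, n_a, n_b, ms_error):
    """F for the contrast between two group means:
    (difference squared) / (MS error * (1/n_a + 1/n_b))."""
    return (mean_a - mean_b) ** 2 / (ms_error * (1 / n_a + 1 / n_b))


# Mean trials to criterion: 40-8NE = 71.5, 100-8NE = 45.5 (Table 13).
# Error mean square = 266.04 (Table 12); n = 10 per group is an assumption.
f_observed = pairwise_contrast_f(71.5, 45.5, 10, 10, 266.04)
print(f"F = {f_observed:.2f}")
```

Under these assumptions the contrast gives F of about 12.70, close to the second of the two values reported above (12.71).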


FIG. 9. Mean trials to reversal criterion for three
reward magnitudes as a function of level of acquisition
training for groups with interpolated extinction.

FIG. 10. Mean trials to reversal criterion of large
reward groups with and without extinction as a function
of level of acquisition training.
[Abscissa: acquisition trials (40, 100).]
Discussion

The major findings of the study can be summarized as
follows:

In extinction

1. As predicted, groups with medium and large magnitude
of reward in acquisition extinguished more easily than small
reward groups, as is indicated by larger starting latencies
in extinction. The predicted similar effect of magnitude
of reward on errors in extinction did not occur. Although
not predicted, it was found that overtraining led to larger
starting latencies in extinction.

2. The predicted interaction between magnitude of reward
and level of acquisition training occurred with error scores,
but not with latencies in extinction. Errors increased (i.e.,
faster extinction) with overtraining for large reward groups,
but decreased with overtraining for small reward groups.

In reversal

1. Trials to criterion were an inverse function of
magnitude of reward, as was predicted. The predicted effect
of reward magnitude on latencies was not found.

2. The predicted interaction between magnitude of reward
and level of acquisition training on trials to criterion was
significant, but the form of the interaction was not as
predicted. For medium and large reward groups which had an
interpolated extinction period there were no significant
differences due to the level of acquisition training, although
the direction of the differences was as predicted. For large
reward non-extinguished groups, the predicted significant
ORE was obtained. For small reward extinguished groups a
reverse ORE was found. The predicted interaction for latencies
was not found.

3. As predicted, an interpolated extinction period
significantly decreased trials to criterion, although the
predicted effect on latencies was not found.

An additional finding in acquisition was that groups
with larger magnitude of reward learned the brightness
discrimination faster.

Acquisition and overtraining. The results of performance
during the 40 acquisition trials indicate that magnitude of
reward affects the rate of discrimination learning. The results
agree with previous findings, such as those of Schrier (1956),
Pubols (1961), Lawson et al. (1959), and Clayton and Koplin
(1964). In these studies, as well as in the present experiment,
errors were an inverse function of magnitude of reward. A
further conclusion of some previous studies has been that
asymptotic performance for different reward groups continued
to differ beyond a performance criterion. In the present
experiment, however, performance in terms of errors in
overtraining was not distinguishable on the basis of amount of
reward. At the end of the 40 acquisition trials all groups
had reached about the same asymptote of performance, as can be
seen from Figure 2 and from the nonsignificant differences
between magnitude groups in the analysis of errors in
overtraining. Throughout overtraining, reward magnitude had no
significant effect on performance. A possible explanation
of this discrepancy between previous studies and the present
results is that previous studies did not have a sufficient
number of trials and therefore did not reach the final common
asymptote of performance.

The acquisition results suggest that reversal learning
will also be affected by the magnitude of reward, since
reversal can be considered the acquisition of new learning.
A study by Kendler and Kimm (1967), however, has shown that
acquisition magnitude of reward is more effective in
determining reversal performance than is the magnitude of
reward in reversal. In the Kendler and Kimm (1967) study,
magnitude  of  reward  in  reversal  was  varied  factorially  for 
each  reward  magnitude  in  acquisition.  The  reversal  results 
showed  that  while  reversal  magnitude  of  reward  did  affect 
the  number  of  trials  to  the  reversal  criterion,  the  major 
differences  in  reversal  performance  were  due  to  the  magnitude 
of  reward  in  acquisition. 

Extinction. From Theios and Blosser's (1965a) model,
it was predicted that overtrained Ss would be less resistant
to extinction than nonovertrained Ss with large reward, but
not with small reward. The results support this prediction
to some extent, but not entirely. The analysis of errors
in extinction showed that with large reward overtrained Ss
extinguished faster than nonovertrained Ss. With small
reward, however, overtraining led to slower extinction,
whereas Theios and Blosser's model would predict no
difference. The results of the small reward groups are thus
in line with Mackintosh's (1965a) assertion that overtraining
of a discriminative task leads to increased resistance
to extinction, when resistance to extinction is measured by
perseverative errors. With reference to the two theoretical
analyses of the ORE, the extinction results for large
reward groups support Theios and Blosser's model and those for
small reward groups support Mackintosh's model.

An  unexpected  finding  of  the  error  data  in  extinction 
was  that  among  the  nonovertrained  groups  the  small  reward 
group  extinguished  faster  than  the  medium  and  large  reward 
groups.  A  possible  explanation  for  this  finding  is  that  the 
small  reward  group  did  not  establish  a  strong  response  to  the 
positive  stimulus  in  acquisition  and  therefore  extinguished 
this  response  more  easily  than  the  larger  reward  magnitude 
groups. The acquisition curves for the three reward magnitudes
showed that while the large and medium reward groups reached
asymptotic performance by day 7 at the latest, the small reward
groups had not quite reached the same asymptotic performance by
day 8, the last day of acquisition.

Starting and running latencies in extinction did not
show the predicted interaction effect of acquisition magnitude
of reward and amount of acquisition training. For starting
latencies all main effects were significant; for large reward
groups overtraining led to larger latencies (i.e., faster
extinction). The starting latency results are in line
with previous studies which found that large reward led
to larger latencies in extinction (Hulse, 1958), and that
overtraining also led to larger latencies (Wagner, 1963;
Mackintosh, 1963). Starting latency in the present
experiment was a more sensitive measure in extinction
than running latency: the former showed predicted group
differences, while the latter did not yield any treatment
effects. Running latency in the present
experiment included time spent at the choice point, which may
have cancelled group differences in running speed in the
stem of the maze.

It  has  been  the  practice  in  discrimination  experiments 
to equate the measures of errors and latencies (Birch, 1955),
although  time  measures  have  been  regarded  as  being  more 
susceptible  to  motivational  variables  than  choice  responses 
(Hillman,  Hunter  &  Kimble,  1953).  The  results  of  this  study 
suggest  that  in  discrimination  learning  and  extinction  the 
two  kinds  of  response  measures  are  not  necessarily  similar. 
Mackintosh  (1963)  has  also  reported  a  discrepancy  between 
the  two  kinds  of  response  measures  in  the  extinction  of  a  reversal 
habit.  He  found  that  overtraining  led  to  more  trials  to  an 
extinction  criterion  of  choice  responses,  but  to  faster 
extinction  in  terms  of  latencies. 

Reversal. The only significant effect for latency
measures in reversal was due to acquisition training level.
The latency results in reversal agree with those in
acquisition, in which no effect due to magnitude of
reward was found either. The results of trials to criterion
indicated  that  the  only  significant  difference  between 
the  40  and  100  trial  groups  of  equal  reward  was  in  the 
small  reward  groups;  a  reverse  ORE  was  found  with  small 
reward  magnitude.  This  finding  agrees  with  two  previous 
studies  (D'Amato  &  Schiff,  1965;  Kendler  &  Kimm,  1967) 
in  which  small  reward  led  to  a  reverse  ORE.  The  conclusion 
to  be  drawn  is  that  overtraining  with  a  small  reward 
magnitude  leads  to  a  reverse  ORE,  rather  than  no  difference 
between  criterion  and  overtrained  Ss  as  predicted  by  the 
Theios  and  Blosser  model.  The  unexpected  performance 
of  the  100-2E  group  is  anticipated  by  its  extinction 
performance.  Its  running  and  starting  latencies  as  well 
as  its  error  scores  were  the  lowest  of  all  groups  indicating 
its  great  resistance  to  extinction.  Together,  these 
results  suggest  that  reversal  learning  with  small  reward  is 
slow  because  of  the  great  resistance  to  extinction  of  the 
original  learning. 

For  the  groups  with  an  interpolated  extinction  period 
there  was  no  significant  difference  between  groups  40-4E 
and  100-4E,  and  40-8E  and  100-8E,  although  the  mean  differences 
were  in  the  predicted  ORE  direction.  An  ORE,  however,  was 
found between groups 40-8NE and 100-8NE, which did not have
an interpolated extinction period. This finding provides
strong support for the extinction explanation of the ORE
since,  for  these  large  reward  groups,  a  significant  effect 
due  to  overtraining  was  found  in  extinction,  but  not  in 
reversal,  for  groups  with  an  extinction  period;  and  for 
groups  without  an  extinction  period  an  overtraining  effect 
was  found  in  reversal.  This  indicates  that  under  conditions 
that  normally  produce  the  ORE,  the  effect  can  be  reduced 
by  an  interpolated  extinction  phase.  In  terms  of  an  extinction 
account  of  the  ORE,  this  means  that  the  differential  resistance 
to  extinction  of  the  groups,  as  a  function  of  differential 
acquisition  treatment,  has  been  equalized  to  the  extent  that 
its effect no longer leads to a significant ORE.

In extinction, large reward overtrained Ss extinguished
faster than large reward nonovertrained Ss, and the
overtrained Ss also reversed faster, although not significantly;
statistical significance was not reached because of the
equalizing effect of the interpolated extinction period.
Together with the extinction performance of the small reward
groups, these results provide support for the predicted
similarity of performance in extinction and reversal.

In conclusion, the results of the present experiment
indicate that 1) speed of reversal learning is related to
the level of extinction of the original learning, 2)
reversal performance can be predicted to some extent from
extinction performance, and 3) extinction of original
learning is an important process in reversal learning.
Since extinction performance only partially predicts reversal
performance, the results further indicate that the level
of extinction of original learning is not the only
determinant of reversal performance. The acquisition
data suggest that one further variable determining reversal
performance is the magnitude of reward in reversal.

The  results  of  the  present  experiment  favour  extinction 
hypotheses  of  the  ORE  and  cast  some  doubt  on  the  attention 
model  analysis  of  the  ORE.  Two  conditions  that  have  been 
shown  to  control  the  ORE  within  the  framework  of  the 
extinction  hypothesis  are  the  magnitude  of  reward  and  the 
level  of  extinction  of  the  original  learning.  Further,  the 
effect  of  reward  magnitude  on  reversal  learning  appears  to 
act  primarily  through  its  effect  on  the  extinction  of  original 
learning.  A  complete  analysis  of  the  ORE  thus  needs  to 
take  into  account  the  effect  of  differential  extinction  of 
original  learning  on  reversal  learning. 